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Abstract. — Phylogenetic trees show a remarkable slowdown in the increase of number of lineages towards the present, a 
phenomenon which cannot be explained by the standard birth-death model of diversification with constant speciation and 
extinction rates. The birth-death model instead predicts a constant or accelerating increase in the number of lineages, which 
has been called the pull of the present. The observed slowdown has been attributed to nonconstancy of the speciation and 
extinction rates due to some form of diversity dependence (i.e., species-level density dependence), but the mechanisms 
underlying this are still unclear. Here, we propose an alternative explanation based on the simple concept that speciation 
takes time to complete. We show that this idea of "protracted" speciation can be incorporated in the standard birth-death 
model of diversification. The protracted birth-death model predicts a realistic slowdown in the rate of increase of number 
of lineages in the phylogeny and provides a compelling fit to four bird phylogenies with realistic parameter values. Thus, 
the effect of recognizing the generally accepted fact that speciation is not an instantaneous event is significant; even if it 
cannot account for all the observed patterns, it certainly contributes substantially and should therefore be incorporated into 
future studies. [Birth-death model; pull of the present; pull of the recent; protracted speciation; diversity dependence.] 



The temporal pattern of diversification has been of 
scientific interest for a long time (Yule 1924; Raup et 
al. 1973). Over the last two decades, methods have 
been developed to infer diversification rates from phy- 
logenies (Nee et al. 1992; Harvey et al. 1994; Nee et 
al. 1994a,b; Pybus and Harvey 2000; Nee 2001; Rick- 
lefs 2007). Even though phylogenies may ultimately not 
be sufficient to accurately estimate speciation and/or 
extinction rates (Paradis 2003, 2004: Etienne and Apol 
2009: Rabosky 2010) , the plot of the number of lin- 
eages in the phylogeny versus time, that is, the lineages- 
through-time (LTT) plot (Nee et al. 1992: Harvey et 
al. 1994: Pybus and Harvey 2000: Phillimore and Price 
2008), often shows a remarkable slowdown towards the 
present (McPeek 2008; Phillimore and Price 2008; Ra- 
bosky and Lovette 2008). In contrast, the standard birth- 
death model of diversification (Kendall 1948; Raup et al. 
1973; Nee et al. 1994a) shows an upward turn towards 
the present, which has been called the pull of the present 
(Nee et al. 1994b). To avoid possible confusion, the pull 
of the present is a phenomenon that is distinct from the 
pull of the recent (Raup 1979; Jablonski et al. 2003; Nee 
2006), which describes the apparently increased rate of 
diversification seen in the fossil record caused by more 
complete sampling of recent (and still extant) species. 
The pull of the present in LTT plots is purely a theoreti- 
cal phenomenon, a property of the standard birth-death 
model of diversification. It results from the fact that lin- 
eages arising in the recent past are less likely to have 
become extinct and therefore are more likely to be rep- 
resented in the phylogeny than lineages arising in the 
more distant past. 

Two explanations of the observed slowdown in LTT 
plots have been offered. The first is that it is due to 
a sampling artifact. Two sampling artifacts have been 



identified. Nee et al. (1994b) showed that taking a small 
sample from the actual phylogeny produces this slow- 
down; it transforms the upward turn predicted by the 
model into a downward turn. More recently Purvis et al. 
(2009) argued, on the basis of simulations with the pure 
birth (i.e., without extinctions, Yule 1924) model, that 
an apparent slowdown will be observed if there is age 
dependency in whether nodes are deemed to be speci- 
ation events. Sampling effects cannot explain, however, 
observed slowdowns in (nearly) complete phylogenies 
(Phillimore and Price 2008). The second explanation is 
diversity dependence, that is, species-level density de- 
pendence (Phillimore and Price 2008), the per species 
speciation and / or extinction rates are not constant as in 
the standard birth-death model but decrease with time 
due to niche filling (Schluter 2000; Ricklefs 2010). Al- 
though this is certainly a possibility it has also been ar- 
gued that in contrast, new species may actually create 
new niches (Odling-Smee et al. 2003). 

Here, we offer an alternative explanation of the slow- 
down which is an extension of the standard birth- 
death model in which speciation is assumed not to take 
place instantaneously but is allowed to take time. There 
is general agreement that speciation takes time (Avise 
1999). Speciation requires reproductive isolation, which 
could be either prezygotic (e.g., due to mate choice) 
or postzygotic (e.g., due to reduced hybrid fitness). 
Both prezygotic and postzygotic isolations are corre- 
lated with genetic distance between pairs of species, 
which is a strong indicator of time since divergence be- 
tween the species (Coyne and Orr 2004). There is clear 
evidence that a considerable amount of time may be 
needed to create the genetic distance required to dis- 
tinguish two 'good' species. For example, fertile and vi- 
able hybrids can exist, and old populations on islands 
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are much more likely to be recognized as taxonomically 
distinct than young populations (Price et al. 2010). Avise 
(1999) provides various estimates of upper and lower 
bounds to the duration of speciation (the upper bound 
being set by the divergence time of sister species, see 
also Rosindell et al. (2010), and the lower bound being 
set by the divergence time of phylogroups). For birds 
and mammals, he reports values between 1 and 3 Myr. 
In fish and herpetofauna, the reported rates are similar 
to those of birds and mammals but could in reality be 
much larger due to slower mitochondrial DNA clocks. 
Examples of 5 Myr exist in salamanders. Only in ex- 
ceptional cases, for example, polyploidy in plants, can 
speciation occur instantaneously. Most detailed genetic 
models of speciation also predict that speciation takes 
time (Gavrilets 2004). Here, in order to preserve gener- 
ality, we deliberately do not assume a specific mecha- 
nism for speciation but only recognize the simple fact 
that it is gradual rather than instantaneous. This form 
of speciation, termed "protracted speciation" by Rosin- 
dell et al. (2010), thus implicitly captures the outcome of 
what in reality are complex, ecological, and genetic pro- 
cesses, which given enough time lead to the birth of a 
new species (Schluter 2009). 

Protracted speciation has been shown to resolve prob- 
lems with the predictions of the neutral theory of bio- 
diversity on speciation rate and species longevities 
(Hubbell 2001; Rosindell etal. 2010). Here, we show how 
it explains the slowdown in LTT plots, in general, and 
when applied to four bird phylogenies. Moreover, we 
show that it can predict more imbalanced phylogenies 
than the standard birth-death model. We first study the 
pure birth model (i.e., no extinctions) with protracted 
speciation because it allows analytical treatment, which 
elegantly proves our point mathematically. We then ex- 
plore the birth-death model with protracted speciation 
by simulation. 

Results 

Model Predictions 

Pure birth model 

We start with the pure birth model or Yule (1924) 
model of diversification. We denote the number of 
species with N g where the subscript g will become clear 
later. At a constant rate Ai, species produce new species. 
There is no extinction in this model. Figure 1 A shows the 
pure birth process. The probability that at time t there 
are N g species is given by the following master equa- 
tion: 

dF[ ^ ] - M (N g - l)P[N g - 1 ; f] - AiN g P[Ng ; f] (la) 

with initial condition 

P[N g =N g (0);f = 0] = l (lb) 

This equation can be completely solved analytically, but 
here, we are only interested in the expected number of 
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FIGURE 1. The pure birth model a) with and b) without protracted 
speciation. Dotted lines indicate an incipient species and solid lines 
are good species, c) Phylogeny of the protracted pure birth process of 
panel b: only those lineages that have completed speciation before the 
present will show up in the phylogeny. Note that the branching points 
are at the times that the incipient species are produced, not at the times 
that they become good species. 



species at time t which obeys the ODE 

-^E[N g ; t] = AiE[N g ; t] (2a) 

with initial condition 

E[N g ;t = 0] =N g (0). (2b) 
The solution is straightforward: 

E[N g ;f]=Ng(0) Alt . (3) 

Because all species survive (no extinction), the expected 
number of ancestral lineages in the phylogeny, L, at time 
t for the good species that are extant the present time T 
is simply given by 

E[L ;t,T}= N g (0)e Al *. (4) 

The present time T is irrelevant for this model but will 
be relevant for the protracted form of this model. From 
Equation (4), we see that the logarithm of the number of 
lineages increases linearly with time t: 

lnE[L;f,T] = lnN g (0) + Ajf. (5) 

Protracted Pure Birth Model 

Now we make speciation a protracted process. That 
is, each extant species still produces new species at a 
rate Ai, but these new species are not yet good species. 
Instead, they are incipient species which become good 
species at a rate A2. This means that the time needed 
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to complete speciation is exponentially distributed with 
parameter A 2 , so the mean time it will take to com- 
plete speciation is Note that this protracted spe- 
ciation model differs slightly from that analyzed by 
Rosindell et al. (2010) who assumed a fixed time to com- 
plete speciation. If anything, a stochastically varying 
time to complete speciation seems more realistic. Incip- 
ient species give rise to new incipient species at rate A3 
while they are incipient. Figure IB shows this protracted 
pure birth process. Again, we can write down a master 
equation, but now for the probability P[IV g , Ni ; f] that at 
time t there are N g good (hence the subscript g) species 
and N[ incipient species: 



dP[N g) Ni;f] 
df 



AiNgP[Ng,Ni-l;t] 

+A 3 (N i -l)P[N g ,N i -l;f] 
+A 2 (Ni + l)P[N g - l,Ni + l;f] 
-(A 1 N g +(A 2 + A 3 )N i )P[N g ,N i ;t] 



with initial condition 



P[N g = NJ0),Ni = 0 ; t = 0] = 1. 



(6a) 



(6b) 



This model cannot be solved analytically for P[N g , Nj ; t], 
but we can write down expressions for the ex- 
pected number of good and incipient species at 
time f (see online Appendix SI, available from 
http: / /www.sysbio. oxfordjournals.org): 



^E[N g ; t] = A 2 E[Ni ; f], 



-E[Ni ; t] = AjEpVg ; t] + A 3 E[Ni ; f] - A 2 E[N; ; t](7b) 



with initial condition 



E[N g ; t 
E[Ni ; t ■ 



= 0] 
= 0] 



= N g (0), 
0 



(7c) 
(7d) 



This can be solved in general (see online Appendix SI), 
but here, we look at the special case where incipient 
species gives rise to new incipient species at the same 
rate as good species do (A3 = Ai) in which case, we have 
(see online Appendix SI): 



E[N g ;f] 
E[N;;f] 



exp(A 1 f) +1 ^ 



N 8 (0) 
1 + ^ 

»(»P(M) 



exp(— A 2 f), (8a) 



exp(-A 2 f)). 



(8b) 



Ai 



The expected number of ancestral lineages L at time t 
for the good species extant at time T is the sum of the 
expected number of good species and the expected 
number of incipient species which have at least one 
good descendant species before time T: 

E[L ;t,T]= E[N g ; f] + E[Ni ;*](!- P 0 (T - f))> (9) 
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FIGURE 2. Expected LTT plot for the protracted pure birth model 
(i.e., no extinction) for various values of the speciation completion rate 
A2. The value of the speciation initiation rate Ai is 0.5. The curve for 
^2=00 is barely visible, as it almost coincides with the curve for A2=10. 

where Pq(T — t) is the probability that none of the de- 
scendants becomes a good species. Figure 1C illustrates 
that not all incipient species contribute to L because not 
all incipient species leave good descendant species be- 
fore T. In online Appendix SI, we derive an analytical 
expression for Pn(f) : 



Po(t) 



1 + ^ 

Al 



1 + A2 e (Ai+A 2 )f 
Ai 



(10) 



Inserting this expression and Equation (8) in Equation 
(9) yields 



(7a) E[L;f,T]=N g (0) exp(Aif) 



.(Alt) 



,(-A 2 () 



1 + k e (A,+^)(T-f) 
Ai 



(11) 



We observe that the number of lineages increases less 
with t than in the pure birth model without protracted 
speciation (A 2 = oo), because the second term on the right 
hand side increases with t so the closer to the present, 
the more E[L ; t, T] will differ from the pure birth model. 
Figure 2 illustrates how the slowdown in increase of the 
number of lineages depends on the parameter A 2 . 

The formulas above assume that we start with N g (0) 
good species, but in practice, we look at a phylogeny 
starting with two lineages at crown age at which point at 
least one of the two lineages is incipient (because it orig- 
inates from the other at this point). In online Appendix 
SI, we derive analogous expressions for this initial 
condition. 



Birth-Death Model 

Starting from the pure birth model, but allowing 
for extinctions, one obtains the well-known birth-death 
model (Kendall 1948) for which the master equation 
reads: 



dP[N g ; f] 

df 



= Ai(N g - l)P[N g - 1 ; t] + m(N g + 1) 
P[N g + 1 ; f] - (Ai + m)N g P[N g ; t], (12a) 
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where \±\ is the extinction rate and the initial conditions 
are given by Equation (lb). See Figure 3, panels A and 
B, for an illustration of the birth-death process. The so- 
lution for this model can be obtained analytically with 
the method of characteristics. Here, we are interested in 
the expected number of species at time f, which is given 
by 

E[N g) f] = N g (0)e (Al "^ ) '. (12b) 



To compute the number of ancestral lineages L at time 
f of the species extant at time T, we first need the prob- 
ability Pf t) f 2/ that there are surviving lineages at time t% 
for a process starting at t\ with a single individual. This 
probability is given by (Kendall 1948; Nee et al. 1994a) 



1 



Hi 
A, 



- h,h 



1 _ Hi e -(Ai-w)('2-ti) : 



(13) 



assuming that Ai ^ The number of ancestral lineages 
at time t conditional on survival of the clade until the 
present time T is given by (Nee et al. 1994a, see also on- 
line Appendix S2) 



E stem [L;f,T]=N g (0)e( Al -^' 



1 - (1 



r )N g (0) ' 



(14) 



This expression is valid for the expected number of lin- 
eages when starting with N g (0) species at the stem age 
t = 0. In practice, we usually have data on crown age, 
the branching point of the first two ancestral lineages. 
To produce the model's expectations for a phylogeny 
with a prescribed crown age, we must require that both 
ancestral lineages survive because if only one survives, 
there may still be a phylogeny, but it does not have the 
prescribed crown age. Mathematically, starting with two 
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FIGURE 3. The birth-death model with and without protracted 
speciation. a) A birth-death process that is extinct before the present 
time T, an eventuality that most analyses are conditioned against, b) 
A birth-death process that survives up to the present time T. c) The 
birth-death process of b where speciation is protracted. Dotted lines 
indicate an incipient species and solid lines are good species, d) Phy- 
logeny of the protracted birth-death process of panel c. Only those 
lineages that have completed speciation or incipient lineages whose 
parent species has gone extinct before the present will show up in the 
phylogeny. 



lineages at the crown age t = 0 implies that we can 
simply take twice Equation (14) with N g (0) = 1: 



Uncrown [L- / T] — 2e^ 1 ^ 

m),t 



(15) 



Regardless of whether we use the stem age-based Equa- 
tion (14) or crown-age based Equation (15), the loga- 
rithm of the number of lineages increases more than lin- 
early with time: 



lnE[L ; t, T] ~ (Ai - m)£ + In P. 



(16) 



where we ignored all terms that do not depend on t 
because ¥ t j increases more than linearly with t. This 
means that the model predicts an upturn in number of 
lineages near the present. This phenomenon is called the 
pull of the present (Nee et al. 1994b) and can be seen in 
Figure 4. The verbal explanation is that recently arisen 
species have not had the time to become extinct caus- 
ing an apparent acceleration of diversification near the 
present. 



Protracted Birth-Death Model 

Making speciation protracted changes the master 
equation to 



dP[N g ,N i; f] 
dt 



= AiN g P[Ng, Ni - 1 ; t] + A 3 (Ni - 1) 

xP[N g ,Ni- l;f] +A 2 (Ni + l) 
xP[N g - l,Ni + 1 ; f] + m(N g + 1) 
xP[Ng + l,Ni;f] + ii 2 (Ni + l) 
xP[N g) N i + l;f]-((A 1 + Li 1 )Ng 
+ (A 2 + A 3 + Li2)N i )P[Ng,N i ;f]. (17) 

See Figure 3C for an illustration of the process. As in the 
protracted pure birth model, the master equation cannot 
be solved analytically. Nevertheless, it is again possible 
to obtain analytical expressions for the expected number 
of good and incipient species at time t . We will not write 
out these expressions and their solutions explicitly be- 
cause they are cumbersome to write, and, more impor- 
tantly, they cannot easily be used to obtain an expression 
for E[L ; t, T]. First of all, conditioning on survival to the 
present, as in Equation (14), is no longer trivial because 
this requires (a function of) P[N g , N; ; t] for which an an- 
alytical solution is lacking. Furthermore, the addition of 
the expectations of N g and Ni with a correction for the 
latter, as in Equation (9), no longer holds, because of the 
complicating factor that even if an incipient species has 
not become a good species by time T, it will be counted 
as a good species if its immediate ancestor was good but 
has become extinct (it simply replaces this extinct ances- 
tor, because as long as it has not completed speciation, 
it will be considered identical to the ancestor species). 
Figure 3D explains this. 

Because further analytical treatment seems extremely 
challenging, we simulated the process in order to gain 
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FIGURE 4. Expected LTT plots for the protracted birth-death model for various values of the extinction rate [i\ = 1x2 (upper panels) and the 
corresponding histograms of the slowdown statistic r (bottom panels). The lines are for different speciation completion rates A2. The value of 
the speciation initiation rate Aj is set at 0.5. The curve for A2 = 00 is barely visible, as it almost coincides with the curve for A2 = 10. 



insight in the behavior of the protracted birth-death 
model. Simulations also have the advantage that we can 
produce the (expected) number of lineages through time 
starting from the crown age, rather than the stem age, 
which require that conditioning should be done on the 
survival to the present of at least two lineages. 

The simulation procedure of the protracted birth- 
death model is straightforward: We used the Gillespie 
(1976) algorithm to simulate the master Equation (17). 
We started with one good species and one incipient 
species because we wanted to look at the results for 
crown age. Alternatively, one could start with two in- 
cipient species, but this does not affect the results qual- 
itatively. For every newly arisen incipient species, we 
recorded, during the simulation, the exact time when it 
arose (time of birth), from which species it arose (par- 
ent species), when it completes speciation (time of mat- 
uration), and when it becomes extinct (time of death), 
noting that speciation completion and extinction need 
not occur. At the end of the simulation (at T = 15 Myr 
in our simulations), we constructed the phylogeny from 
this information to check whether the phylogeny had 
the initial good and incipient species as common an- 
cestors (the common ancestors could be younger which 
must be ruled out because all iterations of the simulation 
must have the same crown age so that they can be aver- 
aged). If not, the phylogeny was ignored and the whole 
procedure was repeated. 

Constructing this phylogeny is less straightforward 
than running the simulation. To obtain the phylogeny, 
we counted all extant species regarded as good species 
at the last event in the simulation (time £). Of course, 
good species qualify to be regarded as good, but there 
is a special case where an incipient species must also 
be regarded as good: when it arose from a good species 
that has since become extinct (Fig. 3C). However, when 



several incipient species all arose from a now extinct 
good species, only one should be counted as a good 
species; here, we adopted the convention of always 
choosing the youngest orphaned incipient species. To 
find the branching times of the phylogeny, the lineages 
considered good at time t were followed backwards in 
time until the birth event of one of these lineages (as an 
incipient species) was encountered. We define the age 
of a species as the time that passes between birth and 
death (extinction), not the time between maturation and 
death (extinction). This definition most closely matches 
the way real phylogenies are constructed. We continued 
the backwards search for birth events until one lineage 
remained at the crown age: The point just before the 
birth event of the second still extant lineage. In all our 
simulations, we assumed that incipient species produce 
new incipient species at the same rate as good species 
do (A3 = Ai), but our code allows for different values 
for these rates (A3^=Ai). Online Appendix S3 contains 
a pseudo-code for the simulations which explains the 
construction of the phylogeny in more detail. A Matlab 
code is available upon request from the corresponding 
author. 

Figure 4 shows the LTT plots for various extinc- 
tion rates (where extinction rates of incipient and good 
species are identical). Furthermore, it shows histograms 
of the slowdown statistic Ar which is calculated as 
(Pigot et al. 2010) 



In 



Ar = 



L(lT) 



-In 



L(|T) 
L(0) 



In 



1(7) 
mi) 



+ ln 



L(0) 



(18) 



This is a better statistic than the often-used -^-statistic 
which depends on the size of the tree. It is clear that, as 
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time (Myr) time (Myr) time (Myr) 

FIGURE 5. Expected LTT plots for the protracted birth-death model for various values of the incipient species extinction rate [1.2, where the 
extinction rate of good species is set at = 0.4. The lines are for different speciation completion rates A2. The value of the speciation initiation 
rate Ai is set at 0.5. 



in the protracted pure birth model, making speciation 
protracted (lower values of A2) causes a slowdown in 
the increase of the number of lineages through time. Re- 
markably, not only does the mean of the measure of 
slowdown Ar shift to the left but also the variance of 
Ar becomes larger for smaller values of A2 (longer mean 
time to complete speciation). 

In Figure 4, the extinction rates of incipient and good 
species are set equal. This may not be realistic (see Dis- 
cussion section). Figure 5 shows the effect of different 
extinction rates for incipient and good species. When 
the incipient species extinction rate is high, protracted 
speciation no longer causes a slowdown in the LTT 
plot. The reason is simple: When incipient species are 
highly likely to go extinct, incipient species that manage 
to become good species will necessarily have to do so 
quickly, that is, they should take a short time to complete 
speciation. The situation is then similar to the birth- 
death process without protracted speciation, with the 
corresponding pull of the present. 



x in terms of the model parameters (see online Ap- 
pendix S4): 



T = 



D — A? + 



M-2 



log 



1 + 



-A3+IX2 



(19) 



where 



D = yJ(X 2 + A3) 2 + 2(A 2 - A 3 ) H2 + n|. 



(20) 



If A3 = 0, this expression simplifies to (see online Ap- 
pendix S4) 



A 2 + H2 



and for \x 2 = 0, we have 



t = — log 
A3 



1 + 



A 3 



(21a) 



(21b) 



Speciation Completion Time and Duration of 
Speciation 

The parameter A 2 measures the rate of incipient 
species completing speciation, and therefore, its inverse, 
measures the time to complete speciation. However, 
this is not the same as the duration of speciation that 
is measured from speciation events that have actually 
occurred because some incipient species may never be- 
come good species because they go extinct (at rate 1x2 in 
our model) and because each incipient species may pro- 
duce incipient species itself (at rate A 3 in our model) that 
complete speciation before their parent does. One can 
derive an expression for the mean duration of speciation 



Except for this last case, the distribution of the mean du- 
ration of speciation is no longer exponential and may 
have an interior mode (see online Appendix S4). 



Model Fit to Data 

An exact expression for the likelihood of a phylogeny 
(and therefore an LTT) can be derived in the special case 
A3 = Ai and \i\ = \i 2 = 0 (see online Appendix S5): 



P(tree) = (N g - 1) ! K^U ^pt.) U P ^- < 22 ) 



P 0 (Xj) 
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time(Myr) time(Myr) 



FIGURE 6. LTT plots (stars) of four bird phylogenies, selected from Phillimore & Price (2008) (see text for selection criteria) and the fits of the 
protracted birth model (gray) and the protracted birth-death model (black). The former is obtained through maximum likelihood, the latter by 
least squares (see text). We assumed that = M-i an d A3 = Aj, so the fitted parameters are A], A2, and m. 



where the x, are the branching times and 
Aj + A 2 



Po(t) = 
Pi(t) = 



A 2 (Ai + A 2 )(e( Al+A2 >'-l)e A2t 
(Aj + Af 1+A2)t ) 2 



(23a) 
(23b) 



Figure 6 shows the fit of the protracted birth model 
(in green) to four bird phylogenies (Acanthiza, Craci- 
dae, Myiborus, and Toxostoma) using this likelihood. The 
phylogenies were selected from the bird phylogenies of 
Phillimore and Price (2008) with the criteria that a slow- 
down must be clearly visible in the data and that there 
are no missing species. Table 1 contains the parameter 
estimates. 

We have not yet been able to find an expression 
for the likelihood for the protracted model with ex- 
tinction because of difficulties defining good species 
(incipient species with extinct good parents must be 
considered good). Therefore, we employed a different 
fitting method to fit the model to four bird phyloge- 
nies. The fitting method is a least-squares fit of the 



full LTT plot: Using a simplex optimization algorithm, 
we searched for the parameters that minimize the dis- 
tance between the observed LTT plot and the expected 
LTT plot, where the latter was obtained by simulation 
(10,000 iterations with a different seed for each itera- 
tion and the seeds fixed during the optimization to min- 
imize noise). We found that we could not estimate the 
parameters reliably because the fit did not show an ob- 
servable change across a wide range of extinction rates, 
but the speciation initiation and speciation completion 
rates did depend on the value of the extinction rate. Fig- 
ure 6 shows the fit for an extinction rate of 0.3 Myr -1 , 
whereas Table shows the corresponding parameter esti- 
mates of speciation initiation and speciation completion 
rates and the durations of speciation calculated from 
these rates with Equation (19). The durations of speci- 
ation are definitely in the right ball park, but because 
of the difficulty in accurately estimating the parameters, 
they should be interpreted with care. 

It may seem that the fit of the protracted birth model, 
using maximum likelihood, is not as good as the fit of 
the protracted birth-death model, using least squares, 



TABLE 1. Parameter estimates of four bird phylogenies for the protracted birth model (pb, using likelihood) and the protracted birth-death 



model (pbd, using least squares) 


Data set 


Model 


Aj = A 3 (Myr - 1 ) 


A 2 (Myr - 1 ) 


Hi = [i 2 (Myr J ) 


GOF 


x (Myr ) 


Acanthiza 


pb 


0.47 


0.04 


0 


-7.89 


5.16 




pbd 


0.66 


0.07 


0.3 


0.35 


3.81 


Cracidae 


pb 


0.96 


0.12 


0 


1.13 


2.31 




pbd 


1.09 


0.16 


0.3 


0.47 


1.95 


Myiborus 


pb 


0.48 


0.89 


0 


-2.71 


0.89 




pbd 


0.81 


0.39 


0.3 


0.41 


1.29 


Toxostoma 


pb 


0.43 


0.05 


0 


-8.34 


5.19 




pbd 


0.65 


0.06 


0.3 


0.24 


3.98 



Because the extinction rate cannot be estimated reliably (the goodness of fit [GOF] did not change over a wide range of extinction values), 
the reported value of 0.3 was set beforehand. Also shown are the corresponding values of the GOF statistic, i.e. maximum likelihood and 
least-ssquared distance, and the estimated average duration of speciation (t). 
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but this is only apparent. Maximizing the likelihood 
finds the parameters that make the observed data set 
the mode of the probability distribution of phylogenies, 
whereas the least-squares approach finds the parame- 
ters that make it the mean. Expected LTT plot are, by 
definition, based on the mean. 

Discussion 

We have shown that protracted speciation, which only 
assumes that speciation takes time rather than occurs in- 
stantaneously, can explain the observed slowdown in 
LTT plots. The verbal argument is simple: Speciation 
events that initiated in the recent past may not have 
completed yet, so they do not count towards the total 
number of extant species at the present. Stated other- 
wise, to find branching points in the phylogenetic tree, 
one has to look back into the past at least as far as the 
time needed for speciation to complete, which is on av- 
erage This causes many recent branching points to 
disappear, but deeper branching points will be counted 
as producing good lineages because they almost always 
leave enough time for speciation to complete. 

Not all real phylogenies show a slowdown in diver- 
sification. This may be considered to be at odds with 
our results, but they are actually fully consistent with 
them. Figure 4 shows the distribution of the slowdown 
statistic Ar across 10,000 phylogenies simulated with 
the same parameter set. Whereas the bulk of the simu- 
lations show negative values of the slowdown statistic, 
some of them have slowdown parameters larger than 0. 
Also, the protractedness parameter A2 need not be the 
same for all clades because the speciation process may 
be different across different clades. 

In our model, we have assumed the simplest form of 
protracted speciation, where good species give rise to in- 
cipient species at a constant rate Ai and incipient species 
complete their speciation at a constant rate A2. The latter 
means that the time to complete speciation is exponen- 
tially distributed. Naturally, speciation is a much more 
intricate process than these simple assumptions suggest. 
An incipient species may have to go through several 
stages before it is considered a good species, for ex- 
ample, if speciation requires an accumulation of muta- 
tions (Gavrilets 2004). This can be modeled by assuming 
higher order incipient species Ny (where j is the order) 
that transform into one another with constant rates A2,/. 
We then look at the dynamics of P[Ng,N u , . . . ,N ir „,t] 
when the highest order is n. The time spent in each of 
these incipient states is still exponentially distributed 
with parameter A2,/, but the total time to complete spe- 
ciation, which is the sum of the times spent in each in- 
cipient state, is no longer exponentially distributed. For 
example, when all rates are identical (A2J = A2 for all 
j = 1, . . . , n), then the total time to complete speciation 
is gamma distributed (Akkouchi 2008): 

/( ' ;A2 ' M) =(S< e " 2t < (24) 



whereas for different parameters A 2) y, the probability 
distribution is 

/(f;A 2 , lj ...,A 2i „,n) = ^A 2 , / e-^ f[ \ . (25) 

These distributions, unlike the exponential distribution, 
are hump shaped with their mode at time larger than 0 
and very flexible, so a wide variety of speciation modes 
could be incorporated. Such alternative distributions for 
the time for speciation to complete will have conse- 
quences for the quantitative shape of the LTT plot, but 
we expect that it will not change our results qualita- 
tively: There will still be a slowdown of the increase of 
the number of lineages. For example, consider a model 
with n incipient states with identical rates nA2. We ex- 
pect the slowdown to be more pronounced because 
while the mean time to complete speciation is still the 
modal time to complete speciation, ^=1, is larger than 
0. Furthermore, note that for this model, the larger the 
number of incipient stages, the more peaked is the dis- 
tribution of the speciation-completion time. In the limit 
of an infinite number of incipient stages, the speciation 
completion time becomes fixed. This is the case consid- 
ered in Rosindell et al. (2010) in the context of the neutral 
theory of biodiversity. 

The parameter \i2 is the extinction rate of incipient 
species. It can be argued that incipient species are more 
likely to become extinct (and hence have higher [12 val- 
ues) because of smaller population size or because gene 
flow causes the incipient species to merge with its par- 
ent (McPeek 2008) . However, one could also argue that 
incipient species have a lower extinction rate because 
they are likely to fill a new niche in which there is less 
competition. In reality, both processes may play a role. 
A possible scenario is that initially the extinction rate 
is high, but this extinction rate decreases as time goes 
by. This can all be incorporated in the protracted speci- 
ation model with multiple incipient states. For example, 
in a model with two incipient states, the first state has a 
higher extinction rate than the second. 

Not only can protracted speciation explain slowdown 
in LTT plots, it can also predict more imbalanced trees 
than the standard birth-death model, in line with obser- 
vations (Blum and Francois 2006; Phillimore and Price 
2008), if A3 < Ai and A 2 is small (< 1). This is not due 
to protractedness per se but due to the fact that it takes 
longer for a newly arisen incipient species to speciate 
further than for a good species when A3 < Ai. This in- 
duces a time lag which has been shown to change tree 
imbalance in simulations (Losos and Adler 1995; Rogers 
1996) and analytically (Pinelis 2003). We show this result 
for A3 = 0 in online Appendix S6. 

The protracted speciation model is a parsimonious ex- 
planation for the slowdown in diversification that has, 
to the best of our knowledge, gone unnoticed in the 
literature. Only Weir and Schluter (2007) constructed 
a model which, in hindsight, can be interpreted as an 
early protracted speciation model, but it was not used to 
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explain slowdowns in clade diversification. In fact, our 
model now allows for a rigorous reanalysis of their pio- 
neering work, relating the latitudinal diversity gradient 
to a latitudinal trend in the duration of speciation. Our 
protracted speciation model bears some resemblance to 
the explanations in terms of diversity dependence as 
well as age dependence of nodes. Diversity dependence 
induces a nonconstant rate of speciation, and, likewise, 
protracted speciation induces nonconstant rate of spe- 
ciation where the speciation rate decreases near the 
present day to discount any speciation events that have 
started, but not yet completed. The main difference be- 
tween our model and the diversity-dependence model 
is that in the latter, the speciation rate depends on the 
number of extant species which naturally increases with 
time. Diversity dependence means that the species are 
no longer independent from one another, and hence, the 
analytical treatment of the diversity-dependent birth- 
death model is challenging; recent work in the field 
(Rabosky and Lovette 2008) has been criticized for mak- 
ing unrealistic assumptions for the sake of tractability 
(Bokma 2009). Analytical treatment of the protracted 
speciation model interpreted as a time-dependent birth- 
death model is also not trivial because the distribution 
of speciation completion times cannot be simply trans- 
lated into a time-dependent speciation rate, but perhaps, 
this interpretation provides good approximations and 
the idea merits further study. The second explanation, 
of age dependence of nodes (Purvis et al. 2009), is ac- 
tually an emergent property of protracted speciation. 
Missing species are not interpreted as sampling effects 
but as unfinished speciation processes. In fact, all ex- 
planations for the slowdown in diversification must in- 
volve a mechanism for lowering the speciation rate at 
times near the present. That diversification rates vary in 
time is generally accepted (Bokma 2003), but the ques- 
tion is why they should mostly be decreasing right now. 
The answer is inherent in the explanations in terms 
of sampling effects and protracted speciation, but it is 
a separate assumption in the explanation in terms of 
diversity dependence. Diversity dependence certainly 
plays a role, but it may be confined to specific cases, 
for example, Rabosky and Lovette 2008, whereas pro- 
tracted speciation seems a more general phenomenon. 
The only way to find out is to construct a model that 
contains both protracted speciation and diversity de- 
pendence and compare it to a model with just protracted 
speciation. 

We used a least-squares approach to estimate model 
parameters in the protracted birth-death model because 
we have not yet been able to derive a full likelihood for- 
mula for this model when extinction is nonzero. Such 
a likelihood formula is useful for a statistically sound 
estimation of parameters from given phylogenies and 
comparison of the performance of different models (al- 
though maximum-likelihood-based estimated tend to 
be biased). Nevertheless, the simplicity of protracted 
speciation leaves us hopeful that such a likelihood for- 
mula will become available for our model or a variant 
that also contains the essence of protracted speciation. 



However, even if the likelihood can be computed, the 
estimates are unlikely to be very accurate because we 
expect the likelihood surface to be fairly flat across a 
wide range of extinction rates. This expectation is based 
on results with the least-squares approach where the 
least-squares distance between model and data showed 
no observable change across a wide range of extinction 
rates. Our inability to accurately estimate parameters 
will be even more pronounced in more realistic models 
with Ai ^ A3 and ^= \i2 or a more realistic distribution 
of speciation completion times because of the higher 
number of df . Consequently, the phylogeny alone might 
not allow accurate estimation of the key parameters and 
the relative contributions of protracted speciation and 
other mechanisms such as diversity dependence in ex- 
plaining the slowdown. This may require development 
of methods employing additional data such as the fos- 
sil record (Paradis 2003, 2004; Etienne and Apol 2009; 
Purvis et al. 2009) . 

We have shown that the observed slowdowns in di- 
versification can be explained with the simple assump- 
tion that speciation takes time. In no way do we exclude 
the possibility or importance of diversity-dependent di- 
versification; rather, we regard protracted speciation as 
an alternative, possibly complementary parsimonious 
explanation that cannot be ignored in future models of 
diversification. 
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Supplementary material, including data files 
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http: / / www.sysbio.oxfordjournals.org. 
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